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Abstract. This article is concerned with the integration of the time-dependent Ginzburg- 
Landau (TDGL) equations of superconductivity. Four algorithms, ranging from fully explicit to fully 
implicit, are presented and evaluated for stability, accuracy, and compute time. The benchmark 
problem for the evaluation is the equilibration of a vortex configuration in a superconductor that is 
embedded in a thin insulator and subject to an applied magnetic field. 
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1. Introduction. At the macroscopic level, the state of a superconductor can 
be described in terms of a complex- valued order parameter and a real vector potential. 
These variables, which determine the superconducting and electromagnetic properties 
of the system at equilibrium, are found as solutions of the Ginzburg-Landau (GL) 
equations of superconductivity. Because they correspond to critical points of the GL 
energy functional 0, they can, at least in principle, be determined by minimizing 
a functional. In practice, one introduces a time-like variable and computes equilib- 
rium states by integrating the time-dependent Ginzburg-Landau (TDGL) equations. 
The TDGL equations, first formulated by Schmid[3| and subsequently derived from 
microscopic principles by Gor'kov and Eliashberg J4|, are nontrivial generalizations of 
the (time-independent) GL equations, as the time rate of change must be introduced 
in such a manner that gauge invariance is preserved at all times. The TDGL equa- 
tions have been analyzed by several authors; see, for example, the articles J^, [| and 
the references cited there. 

We are interested, in particular, in vortex solutions of the GL equations. These 
are singular solutions, where the phase of the order parameter changes by 2n along 
any closed contour surrounding a vortex point. Vortices are of critical importance in 
technological applications of superconductivity. 

Computing vortex solutions of the GL equations by integrating the TDGL equa- 
tions to equilibrium has the advantage that the solutions thus found are stable. At 
the same time, one obtains information about the transient behavior of the system. 
Integrating the TDGL equations to equilibrium is, however, a time-consuming pro- 
cess requiring considerable computing resources. In simulations of vortex dynamics in 
superconductors, which were performed on an IBM SP with tens of processors in par- 
allel, using a simple one-step Euler integration procedure, we routinely experienced 
equilibration times on the order of one hundred hours @, |[ Incremental changes 
would gradually drive the system to lower energy levels. These very long equilibration 
times arise, of course, because we are dealing with large physical systems undergoing a 
phase transition. The energy landscape for such systems is a broad, gently undulating 
plain with many shallow local minima. It is therefore important to develop efficient 
integration techniques that remain stable and accurate as the time step increases. 
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In this article we present four integration techniques ranging from fully explicit 
to fully implicit for problems on rectangular domains in two dimensions. These two- 
dimensional domains should be viewed as cross sections of three-dimensional systems 
that are infinite and homogeneous in the direction of the magnetic field, which is 
orthogonal to the plane of the cross section. The algorithms are scalable in a multi- 
processing environment and generalize to three dimensions. We evaluate the perfor- 
mance of each algorithm on the same benchmark problem, namely, the equilibration 
of a vortex configuration in a system consisting of a superconducting core embedded 
in a blanket of insulating material (air) and undergoing a transition from the Meissner 
state to the vortex state under the influence of an externally applied magnetic field. 
We determine the maximum allowable time step for stability, the number of time 
steps needed to reach the equilibrium configuration, and the CPU cost per time step. 

Different algorithms correspond to different dynamics through state space, so the 
eventual equilibrium vortex configuration may differ from one algorithm to another. 
Hence, once we have the equilibrium configurations, we need some measure to assess 
their accuracy. For this purpose we use three parameters: the number of vortices, 
the mean intervortex distance (bond length), and the mean bond angle taken over 
nearest-neighbor pairs of bonds. When each of these parameters differs less than a 
specified tolerance, we say that the corresponding vortex configurations are the same. 

Our investigations show that one can increase the time step by almost two orders 
of magnitude, without losing stability, by going from the fully explicit to the fully 
implicit algorithm. The fully implicit algorithm has a higher cost per time step, but 
the wall clock time needed to compute the equilibrium solution (the most important 
measure for practical purposes) is still significantly less. The wall clock time can be 
reduced further by using a multi-timestepping procedure. 

In §|[ we present the Ginzburg-Landau model of superconductivity, first in its 
formulation as a system of partial differential equations, then as a system of ordinary 
differential equations after the spatial derivatives have been approximated by finite 
differences. In we give four algorithms to integrate the system of ordinary equa- 
tions: a fully explicit, a semi-implit, an implicit, and a fully implicit algorithm. In 
we present and evaluate the results of the investigation. In we further evaluate the 
fully implicit algorithm from the point of view of parallelism and multi-timestepping. 
The conclusions are summarized in §|[ 

2. Ginzburg-Landau Model. The time-dependent Ginzburg-Landau (TDGL)| 
equations of superconductivity ||, ^| are two coupled partial differential equations 
for the complex- valued order parameter tjj = \ip\e 1 ^ and the real vector- valued vector 
potential A, 

Zm s \ t c J 

— V x V x A + J s . 

An 

Here, J s is the supercurrent density, which is a nonlinear function of "0 and A, 

(2.3) J s = - W*) - ^-|VfA = (ftV0 - -A) . 

2im s m s c m s \ c ) 

The real scalar-valued electric potential $ is a diagnostic variable. The constants in 
the equations are h, Planck's constant divided by 2n; a and 6, two positive constants; 



x h 2 / d ie s \ , 

(2.1) + _£$ ,/, = _ 

V ' 2m s D \dt h ) V 
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c, the speed of light; m s and e s , the effective mass and charge, respectively, of the 
superconducting charge carriers (Cooper pairs); v, the electrical conductivity; and 
D, the diffusion coefficient. As usual, i is the imaginary unit, and * denotes complex 
conjugation. 

The quantity \ip\ 2 represents the local density of Cooper pairs. The local time 
rate of change dtA of A determines the electric field, E = (l/c)9(A + V$, its spatial 
variation the (induced) magnetic field, B = V x A. 

The TDGL equations describe the gradient flow for the Ginzburg-Landau energy 
functional. This functional is zero in the normal state, when tp — and the externally 
applied magnetic field penetrates the superconductor everywhere, V x A = H. In the 
superconducting state, it is given by the expression 

2 



(2.4) E = 



1 



2m, 



au 



dx. 



The three terms represent the kinetic energy, the condensation energy, and the field 
energy, respectively. A thermodynamic equilibrium configuration corresponds to a 
critical point of E. 



The energy functional (2.4) assumes that there are no defects in the supercon- 



ductor. Material defects can be naturally present or artificially induced and can be in 
the form of point, planar, or columnar defects (quenched disorder). A material defect 
results in a local reduction of the depth of the well of the condensation energy. A 
simple way to include material defects in the Ginzburg-Landau model is by assuming 
that the parameter a depends on position and has a smaller value wherever a defect 
is present. 

2.1. Dimensionless Form. Let ip 2 ^ = a/b, and let A, £, and H c denote the 
London penetration depth, the coherence length, and the thermodynamic critical 
field, respectively, 

(2.5) 



A 



m s c 



1/2 



2m, a 



1/2 



ML) 1/2 



In this study, we render the TDGL equations dimensionless by measuring lengths in 
units of £, time in units of the relaxation time £ 2 / D, fields in units of H c ^/2, and 
energy densities in units of (l/Air)H 2 



(2.6) 

(2.7) 
where 
(2.8) 



fdA 



The nondimcnsional TDGL equations are 

. x 2 

V--A) i> + rip - |v>| 2 V>, 

K J 

-V x V x A + J s , 



1 



= — (V>*VV> 



f 



Vcp- -A 

K 



W)-^|^A = ~M' 

Here, n = A/£ is the Ginzburg-Landau parameter and a is a dimensionless resistivity, 
a = (AttD /c 2 )v. The coefficient r has been inserted to account for defects; t(x) < 1 if 
x is in a defective region; otherwise r(x) = I. The nondimensional TDGL equations 
are associated with the dimensionless energy functional 

2 



(2.9) E 



- r |Vf + iM 4 )+|Vx A-H| : 



dx. 
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2.2. Gauge Choice. The (nondimensional) TDGL equations are invariant un- 
der a gauge transformation, 



(2.10) 



G x : (V, A, $) i ► tye* A + nV X , $ - 



Here, x can be any real scalar-valued function of position and time. We maintain the 
zero-electric potential gauge, $ = 0, at all times, using the link variable U, 



(2.11) 



U = exp 



This definition is componentwise: U x = cxp(— in 1 J x A x (x', y, z) dx'), .... The 
gauged TDGL equations can now be written in the form 



(2.12) 

(2.13) 
where 
(2.14) 



dt 



'dp? 



OA 

a— = -V x V x A + J s 
ot 



Js 



Im 



M = x,y,z. 



2.3. Two-Dimensional Problems. From here on we restrict the discussion to 
problems on a two-dimensional rectangular domain (coordinates x and y), assuming 
boundedness in the x direction and periodicity in the y direction. The domain rep- 
resents a superconducting core surrounded by a blanket of insulating material (air) 
or a normal metal. The order parameter vanishes outside the superconductor, and 
no superconducting charge carriers leave the superconductor. The whole system is 
driven by a time-independent externally applied magnetic field H that is parallel to 
the z axis, H = (0, 0, H). The vector potential and the supercurrent have two nonzero 
components, A = (A x ,A y ,0) and J;, = (J x ,J y ,0), while the magnetic field has only 
one nonzero component, B = (0, 0, B), where B = d x A y — d y A x . 

2.4. Spatial Discretization. The physical configuration to be modeled (su- 
perconductor embedded in blanket material) is periodic in y and bounded in x. In 
the x direction, we distinguish three subdomains: an interior subdomain occupied 
by the superconducting material and two subdomains, one on either side, occupied 
by the blanket material. We take the two blanket layers to be equally thick, but do 
not assume that the problem is symmetric around the midplane. (Possible sources 
of asymmetry are material defects in the system, surface currents, and different field 
strengths on the two outer surfaces.) 

We impose a regular grid with mesh widths h x and h yi 

(2.15) ttij = {xi,x i+1 ) x (yj,y j+1 ), x t = x + ih x ; y 3 = y +jh y , 



assuming the following correspondences: 

Left outer surface: x = xq 
Left interface: x = x Ui 
Right interface: 



2 li x 



2 



Right outer surface: x — x Ux + \h x , i 



l = u, 
= n x . 
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One period in the y direction is covered by the points j = 1, . . . ,n y . We use the 
symbols Sc and Bl to denote the index sets for the superconducting and blanket 
region, respectively, 

(2.16) Sc = {(i,j) : G [n sx ,n ex ] x [l,n„]}, 

(2.17) Bl = {(i,j) : (t,i) G [l,n s:E - 1] U [n ex + l,n x ] X [1,%]}. 

The order parameter -0 is evaluated at the grid vertices, 

(2.18) ipij = i>(xi,yj), G Sc, 

the components A x and A y of the vector potential at the midpoints of the respective 
edges, 

(2.19) A xii ,j =A x (xi + \h x ,Vj), A y . tJ = A y {x il y j + \h y ), (i,j)eScUBl, 
and the induced magnetic field B at the center of a grid cell, 



(2.20) B hJ = B(xt + \h x ,y 3 + \h y ) 

-^y;i-\-l,j ~~ Ay',i,j ^xyi,j-\-l 



A 



XVI, J 



^ (i,i)eScUBl, 



see Fig. 2.1. The values of the link variables and the supercurrent are computed from 




Fig. 2.1. Computational cell with evaluation points for if), A x , and A y . 

the expressions 



U X ; 



1,3 



(2.21) 

(2.22) J x;i j = -r-Im [ipljUxiijipi+ij], Jy,i,j = — j— Im [V>i,j^y;i,i'0i,j+l] 



The discrctizcd TDGL equations are 



(2.23) = (L ra (^, J 0V',^ l + (^(^;v)^v) J +^(^), e Sc, 



(2.24) cr- 

(2.25) cr 



li 

~~dT 

dt 



(D yy A x -i^.)- (D yx A y ; 



Jx;i,j, (i,j)GScUBl, 



(D xx A y .. J ) i - (D xy A x .. t .) i . + J y . i j. G Sc J HI. 



G 
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where 








(2.26) 


(Lxx(Ux;.,j)lp;j)i 


= h x 2 [U x . tit jipi+i t j - 2ipi tj 


+ ^VijV^-ij] , 


(2.27) 


(Lyy(U y;it .)'lpi l .) j 


= Kj 2 [Uy-i.jlpi.j + l - 2tpij 




(2.28) 




= n,ji>i,j - \^i,j\ 2 tpz,j, 




(2.29) 


(DyyA X -i 7 .) ■ 


= hy [Ax-ij+i — 2A x -i.j 4- 




(2.30) 


(D XX Ay;.j) i 


= h x 2 [Ay-i+ij — 2A y - ! i t j + 


Ay;i-l.j\ , 


(2.31) 


(DyxAy-.^^j 


— h x 1 [(Ay- i + 1 J — Ay-i 


j) ~ (A/;<+i,3'-i - 


(2.32) 


(DxyAx-.^^j 


— h x 1 h y 1 [(Ax-ij+i — A x -.i 


■j) ~ {A X ;i-l,j+l — Ax;i-\. 3 



The interface conditions are 

(2.33) i> nsx -l,j = U X ;n ax -l,j1pn sal ,j, = U x;n ex ,3 J ' J = 1j • • • >%• 

At the outer boundary, _B is given, 

(2.34) B 0J = //, , /A,,., = H R . , j = l,... ,n y . 

The resulting approximation is second-order accurate . 

3. Time Integration. We now address the integration of Eqs. ( 2.23| )-( [2.25 ). 

The first equation, which controls the evolution of ip, involves the second-order linear 
finite-difference operators L xx arid L yy , whose coefficients depend on A x and A y , 
and the local nonlinear operator N, which involves neither A x nor A y . Each of the 
other two equations, which control the evolution of A x and A y respectively, involves 
likewise a second-order linear finite-difference operator, but with constant coefficients, 
and the nonlinear supercurrent operator, which involves ip, A x , and A y . The following 
algorithms are distinguished by whether the various operators are treated explicitly 
or implicitly. 

3.1. Fully Explicit Integration. Algorithm I uses a fully explicit forward 
Euler time-marching procedure for ip, A x , and A y . Starting from an initial triple 
A x , A°), we solve for n = 0, 1, . . . , 

(3 1} ^ = (LxxiUS,^), + (L yy (U^.)^.). + N (^.) , (ij) e Sc, 

A n+1 — A n 

(3.2) a *™ *Jhi = {l), w .V; , ) - (DyxAy.. .). + J^, e ScU Bl, 



At 

A n+1 — A n 

(3.3) a vM_ mi = (DxxAlJ. - (Dx V A n x ..,). d + J£ y . (i, j) € ScU Bl, 



where J™ is defined in terms of ip n , A™ , and Ay in the obvious way. The initial triple 
is usually chosen so the superconductor is in the Meissner state, with a seed present 
to trigger the transition to the vortex state. 

Algorithm 1 has been described in Jl0| | . It has been implemented in a distributed- 
memory multiprocessor environment (IBM SP2); the transformations necessary to 
achieve the parallelism have been described in jll| . The code uses the Message Passing 
Interface (MPI) standard (l2) as implemented in the MPICH software library |H| for 
domain decomposition, interprocessor communication, and file I/O. The code has 
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been used extensively to study vortex dynamics in superconducting media fj| |[ 
The underlying algorithm provides highly accurate solutions but requires a significant 
number of time steps for equilibration. For stability reasons, the time step At cannot 
exceed 0.0025. 

3.2. Semi-Implicit Integration. Algorithm II is generated by an implicit treat-| 
mcnt of the second-order linear hnite-difference operators D yy and D xx in the equa- 
tions for A x and A y , respectively, 

(3 4) ^ = (WE£..j)^) . + (LyviKi^Dj + N tyij) > e Sc ' 

A n+1 — A n 

(3.5) a SM_ ?M = (D yy A^.), - {D yx AlJ. t . + 6 Sc U Bl, 

A n+1 — A n 

( 3. 6) a ^ = (^A™^ 1 ). - (A^A£. V ). . + (i, i) € ScU Bl. 
Equations fl3.5j) and (|3.6|) lead to two linear systems of equations, 



At 

(7 

At 



(3.7) [l-—D yy )Alf = Fi^ n ,A n x ,A^), i = 1, . . . ,n x - 1, 



(J 



(3.8) I -f — —D xx J A™^ 1 = Gj(tp n , A™, A"), j = l,..., % , 



for the vectors of unknowns A a; j = {A x -ij : j = 1, . . . ,n y } and A^y- = {A y ^_j : i = 
1, . . . ,71a; — 1}. The matrix D yy has dimension rt y x n y and is periodic tridiagonal 
with elements —h~ 2 , 2/i~ 2 , —h y 2 ; the matrix Z?^ has dimension (n x — 1) x (rtj, — 1) 
and is tridiagonal with elements ~h x 2 , 2h x 2 , —h x 2 , (except along the edges, because 
of the boundary conditions). Both matrices are independent of i and j. Furthermore, 
if the boundary conditions are time independent, they are cons tant throu gho ut the 



time-stepping process. Hence, the coefficient matrices in Eqs. (3/7) and (3.8) need 
to be factored only once; in fact, the factorization can be done in the preprocessing 
stage and the factors can be stored. 

In a parallel processing environment, the coefficient matrices extend over several 



processors, so Eqs. (3.7) and (3.£) are broken up in blocks corresponding to the manner 
in which the computational mesh is distributed among the processor set. We first solve 
the equations within each processor (inner iterations) and then couple the solutions 
across processor boundaries (outer iterations). Hence, we deal with interprocessor 
coupling in an iterative fashion. Two to three inner iterations usually suffice to reach 
a desired tolerance for convergence. After each inner iteration, each processor shares 
boundary data with its neighbors through MPI calls. 

3.3. Implicit Integration. Algorithm III combines the semi-implicit treatment 
of A x and A y with an implicit treatment of the order parameter, 



(3 9) ^ ^ ^ = ^ xx(JJ n . )r +i^ + ( Lyy (U^l +1 ). + N (^.) , 
A n+1 — A" 

(3.10)7 x > id At x ™ = {D m AZg). - {D yx A n y , t ). d + J« y , e Sc u Bl, 

A n+1 — A™ 

(3.11V ^ = (D xx A^) i - {D xy Al^). . + J- id . (/.,: G ScU Bl. 



e Sc, 
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The second and third equation are solved as in the semi-implicit algorithm of the 
preceding section. The first equation is solved by a method similar to the method of 
Douglas and Gunn E3] for the Laplacian. 



We begin by transforming Eq. (3.9) into an equation for the correction matrix 
t> n+1 = ip n+1 — tjj n . The equation has the general form 



(3.12) 



(/ - At(L xx + L yy )) r +L = F{r,A n x , Ay) 



If At is sufficiently small, we may replace the operator in the left member by an 
approximate factorization, 

(3.13) (7 - At(L xx + Lyy)) « (I - AtL xx ) (I - AtL yy ) , 

and consider, instead of Eq. ( |3.12| ) , 



(3.14) 



(I - AtL XX ) (I - AtLyy) 



F(r,A n x ,A"). 



This equation can be solved in two steps, 



(3.15) 
(3.16) 



(/ - AtL xx ) <p = F, 
ip. 



(I - AtLyy) 



The conditions (2.33), which must be satisfied at the interface between the supercon- 
ductor and the blanket material, require some care. If we impose the conditions at 
every time step, then 



in+l 



in+1 _ (Tjn+1 \* An+1 



u x;n sx -l,j'rn slc ,j < [ u x;n 3x -l,j u x;n sx -l,3 J Vn sx ,ji 

(U n+1 )* - (U n V 

V x;n ex ,3 J \ x:n sx — l.j / 



for j = 1, . . . ,n y . These conditions couple the correction <f> to the update of A x . To 
eliminate this coupling, we solve Eq. ( 3.1 2| ) subject to the reduced interface conditions 



(3.17) 
(3.18) 



1 — JJ n+1 d> n+1 i — 1 n 

-1,3 - U x;n sa ,-l,j ( P n , x ,j^ J J-j • • • j n y , 

- 1 - (U n+1 )* 6 n+1 1 = 1 n 



When Eq. (3.12) is replaced by Eq. (3.14), these conditions are inherited by the 
system (3.15). 



3.4. Fully Implicit Integration. Algorithm IV uses a fully implicit integration 
procedure for the order parameter, 

(3.19) ^ - ^ = {L xx (U2.. J W$ 1 ) i + (LyyiU^.W?; 1 ). + N (^ +1 ) , (i,j) G 
A n+1 — A n 

(3.20) 7 = {DyyA n + 1 ) j - (/) v ,.l!; : ...),, + J n x . Ad , (i, j) G Sc U Bl, 

A n+1 — A n 

(3.21>7 ViiJ . = (/),,.!': ') (D„„AZ. 



At 



J£. id . (i,j)eScUBL 



The new element here is the term N (ip^ 1 ) in the first equation. 
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The second and third equations are solved again as in the semi-implicit algorithm. 
The first equation is solved by a slight modification of the method used in the im- 
plicit algorithm of the preceding section, The modification is brought about by the 
approximation 

(3.22) N (ip n+1 ) = Tijj n+1 - \ip n+1 \ 2 ip n+1 ~-^- t (S (tp n ) - ip n ) , 



where S is a nonlinear map, 



(3.23) S{ip) 



[H 2 + (t- |^| 2 )exp(-2rAi)] 



1/2- 



(This approximation is explained in the remark below.) Equation (3.19) is again of 
the form (3.12), but with a different right-hand side, 

(3.24) (I - At(L xx + L yy M l+1 = G(if> n , A n xl A n y ). 

The difference is that, where F in Eq. ( 3.1 2| ) contains a term (At)N (ip n ), G in 
Eq. ( |3.24 ) contains the more complicated term S (ip n ) — ip n . 

Remark. The approximation (3.22) is suggested by semigroup theory. Symboli- 
cally, 

(3.25) N(i>) = lim 5(A y~^ . 
v ' v ' At^O At 

To find an expression for the "semigroup" S, we start from the continuous TDGL 



equations (J2 . 6|) (2.8) (zero-electric potential gauge, $ = 0), using the polar represen- 
tation ip = |^| e^, 

(3.26) d t \i>\ = A|^| - |V||V0 - K - X A| 2 + rM - M 3 , 

(3.27) \ip\d t (j) = 2(V|-0|) • (V0 - kT x A) + \ip\V ■ (V0 - Kf^A), 

(3.28) ad t A = -V x V x A + k' 1 ]^ 2 ^^ - k _1 A). 

At this point, we are interested in the effect of the nonlinear term |^| 3 on the dynamics. 
To highlight this effect, we concentrate on the time evolution of the scalar u = \ip\ and 
the vector v = V</> — k _1 A. (In physical terms, u 2 is the density of superconducting 
charge carriers, while u 2 v is k times the supercurrent density.) Ignoring their spatial 
variations, we have a dynamical system, 

(3.29) u = -u\v\ 2 + tu - u 3 , 

(3.30) v' = -eu 2 v, 

where ' denotes differentiation with respect to t, and e = (k 2 ct) _1 . This system yields 
a pair of ordinary differential equations for the scalars x = u 2 and y = \v\ 2 , 

(3.31) x' = 2x{r-x-y), 

(3.32) y' = -2exy. 

If k, is large, e is small, and the dynamics are readily analyzed. To leading order, y 
is constant; y = is the only meaningful choice. (Recall that xy 1 / 2 is k times the 
magnitude of the supercurrent density.) Then the dynamics of x are given by 

(3.33) x'=2x(t-x). 



10 



D. O. GUNTER, H. G. KAPER, AND G. K. LEAF 



We integrate this equation from t = t n to t, 

(3-34) x(t) = — -. ^ftn) . - , r-. 

K ' x(t n ) + (r - x(t n )) exp(-2r(t - t n )) 

In particular, 

(3.35) x(t n+1 ) - TX{tn) 



x(t n ) + (r - x(t n )) exp(-2rAi) 
where At = t n+1 - t n . Since x(t n ) = |^ n | 1/2 and x(t n+1 ) = \ip n+1 \ 1 / 2 , it follows that 



(3.36) |V 



n+l,_ ^ /2 \r\ 



[\ip n \ 2 + (r - |V"| 2 ) exp(-2rAt)] 1 /2 ' 
The phase (j> aiip is constant in time. If we multiply both sides by e l< ^, we obtain the 



expression (3.23) for the "semigroup" S. 



4. Evaluation. We now present the results of several experiments, where the 
algorithms described in the preceding section were applied to a benchmark problem. 

4.1. Benchmark Problem. The benchmark problem adopted for this investi- 
gation is the equilibration of a vortex configuration in a homogeneous superconductor 
without defects (k = 16, a = 1, r = 1) embedded in a thin insulator (air), where the 
entire system is periodic in the direction of the free surfaces (y). 

The superconductor measures 128£ in the transverse (x) direction. The thickness 
of the insulating layer on cither side is taken to be 2£, so the width of the entire 
system is 132£. The period in the y direction is taken to be 192£, so the entire system 
measures 132£ x 192£. 

The computational grid is uniform, with a mesh width h x — h y = i£. The 
periodic boundary conditions in the y direction are handled through ghost points, so 
the computational grid has 264 x 386 vertices. The index sets for the superconductor 
and blanket (see Eqs. ( |2.16 ) and (2.17)) are 



(4.1) Sc = {(i,j) :i = 5,... ,260, j = 1, . . . ,386}, 

(4.2) Bl = {(i,i):i = l,... ,4,261,... ,264, j = 1, . . . , 386}. 

The applied field is uniform, 

(4.3) H L =H R = H = 0.5. 

(Units of H are H c y/2, so H sa 0.707 . . . H c ). As there is no transport current in the 
system, the solution of the TDGL equations tends to an equilibrium state. 

4.2. Benchmark Solution. First, preliminary runs were made to determine, for 
each algorithm, the optimal number of processors in a multiprocessing environment. 



Figure 4.1 shows the wall clock time for 50 time steps against the number of processors 
on the IBM SP2. Each algorithm shows a saturation around 16 processors, beyond 
which any improvement becomes marginal. All problems were subsequently run on 
16 processors. 

Next, we used the fully explicit Alg or ithm I to establish a benchmark equilibrium 



configuration. We integrated Eqs. (3.1)— (3.2 ) with a time step At = 0.0025 (units of 
£ 2 /D), the maximal value for which the algorithm remained stable, and followed the 
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Fig. 4.1. Elapsed time for 50 time steps as a function of the number of processors. 



evolution of the vortex configuration by monitoring the number of vortices and their 
positions. Equilibrium was reached after 10,000,000 time steps, when the number 
of vortices remained constant and the vortex positions varied less than 1.0 x 10~ 6 
(units of £). The equilibrium vortex configuration had 116 vortices arranged in a 
hexagonal pattern; see Fig. [h^. The wall clock time for the entire computation was 
approximately 3,000 minutes. The elapsed time per time step (0.018 seconds) is a 
measure for the computational cost of Algorithm I. 

4.3. Evaluation of Algorithms II— IV. With the benchmark solution in place, 
we evaluated each of the remaining algorithms (II— IV) for stability, accuracy, and 
computational cost. 

We found the stability limit in the obvious way, gradually increasing the time 
step and integrating to equilibrium until arithmetic divergences caused the algorithm 
to fail. Equilibrium was defined by the same criteria as for the benchmark solution: 
no change in the number of vortices and a variation in the vortex positions of less 
than 1.0 x 10" 6 . 

Because each algorithm defines its own path through phase space, one cannot 
expect to find identical equilibrium configurations nor equilibrium configurations that 
are exactly the same as the benchmark. The equilibrium vortex configurations for 
the four algorithms were indeed different, albeit slightly. To measure the differences 
quantitatively, we computed the following three parameters: (i) the number of vortices 
in the superconducting region, (ii) the mean bond length joining neighboring pairs of 
vortices, and (iii) the mean bond angle subtended by neighboring bonds throughout 
the vortex lattice. In all cases, the number of vortices was the same (116); the mean 
bond length varied less than 1.0 x 10 -3 £, and the mean bond angle varied by less than 
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Fig. 4.2. Equilibrium vortex configuration for the benchmark problem. 

l.Ox 10~ 3 radians. Within these tolerances, the equilibrium vortex configurations were 
the same. 



The results are given in Table 4.1; At is the time step at the stability limit (units 



Table 4.1 
Performance data for Algorithms I-IV. 



Algorithm 


At 


N 


C 


T 


I 


0.0025 


10,000,000 


0.018 


3,000 


II 


0.0500 


500,000 


0.103 


858 


III 


0.1000 


250,000 


0.232 


967 


IV 


0.1900 


100,000 


0.233 


388 



of £ 2 / D), N the number of time steps needed to reach equilibrium, and C the cost 
of the algorithm (seconds per time step). From these data we obtain the wall clock 
time needed to compute the equilibrium configuration, T — NC/60 (minutes). 

Note that the existence of a stability limit for Algorithm IV is a consequence of 
the implementation in a multiprocessing environment. Since we restrict interprocessor 
communication to the end of each time step, the fully implicit character of the algo- 
rithm is lost. On a single processor, Algorithm IV is fully implicit, and the stability 
limit is infinite. 
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5. Further Evaluation of Algorithm IV. We evaluated the fully implicit Al- 
gorithm IV in more detail by considering its speedup in a multiprocessing environment 
and its performance under a multi-timestepping procedure. 

5.1. Parallelism. First, we investigated the speedup of Algorithm IV in a mul- 
tiprocessing environment, using the the benchmark problem and two other problems, 
obtained form the benchmark problem by twice doubling the size of the system in 
each direction. The mesh width was kept constant at so the resulting computa- 
tional grid had 264 x 386 vertices for the benchmark problem, 528 x 772 vertices for 
the intermediate problem, and 1056 x 1544 vertices for the largest problem. Speedup 
was defined as the ratio of the wall clock time (exclusive of I/O) to reach equilibrium 
on p processors divided by the time to reach equilibrium on a single processor for 
the benchmark and intermediate problem, or twice the time to reach equilibrium on 
two processors for the largest problem. (The largest problem did not fit on a single 
processor.) The results are given in Fig. 5.1. The curve for the benchmark problem 
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Fig. 5.1. Performance of Algorithm IV in a multiprocessing environment. 

was obtained as an average over many runs; the data for the intermediate and largest 
problem were obtained from single runs, hence they are less smooth. The speedup is 
clearly linear when the number of processors is small; it becomes sublinear at about 
12 processors for the smallest problem, 14 processors for the intermediate problem, 
and 18 processors for the largest problem. 

5.2. Multi-timestepping. The final set of experiments shows that the perfor- 
mance of Algorithm IV is enhanced by a multi-timestepping procedure, where A is 
updated less frequently than tp, 

{D yx A^.)., + J:. id , (i,j)eScUBL 

{D xy A^.). t .+J; ;i;i . (U)eScUBl. 

When m = 1, both tp and A are updated at every time step, but when m is greater 
than 1, A is updated only every mth time step. In the limit as m — > oo, this procedure 
yields the frozen-field approximation, which is a good approximation of the Ginzburg- 
Landau model near the upper critical field when the charge of the superconducting 
charge carriers is small [ p^| . 



(5.2)cr 



(5.3) ct 



/l ■ - 1 /I 

x;i,j x;i,j 

mAt 

An+m in 

y;i,j v;i,j 
mAt 



I' • ';/: ./ )i 
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We applied this modification of Algorithm IV with m = 10, 15 to the benchmark 
problem of §|]. The results are given in Table 5.1. All computations were done with 



Table 5.1 

Effect of update frequency on time and cost of Algorithm IV. 



m 


N 


C 


T 


1 


100,000 


0.2330 


388 


10 


200,000 


0.0707 


236 


15 


250,000 


0.0646 


270 



At — 0.19 (units of t; 2 /D). The data for the computation with m — 1 are taken 
from Table 4.1. We observe that the cost of the algorithm (C, seconds per time 
step) decreases with increasing to, while the number of time steps to equilibrium 
(N) increases. If to = 10, the overall wall-clock time (T = NC/60, minutes) is 
approximately one-third less than the wall-clock time for m = 1. For larger values 
of to, the increase in the number of steps needed to reach equilibrium offsets any 
gain from the decrease in cost. These data suggest an optimal strategy, where A is 



updated every 10 time steps. Figure 5.2 shows the effect of the updating frequency 



on the evolution of the free-energy functional (2.9). Note the dramatic increase of the 
wall-clock time for to = 15. Eventually, the curve for to — 15 merges with the other 
curves, but this happens well beyond the range of the figure. 




Fig. 5.2. Effect of the updating frequency on the evolution of the energy functional. 



6. Conclusions. The results of the investigation lead to the following conclu- 
sions. 

(i) One can increase the time step At nearly 80-fold, without losing stability, by 
going from the fully explicit Algorithm I to the fully implicit Algorithm IV. 

(ii) As one goes to the fully implicit Algorithm IV, the complexity of the matrix 
calculations and, hence, the cost C of a single time step increase. 
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(iii) The increase in the cost C per time step is more than offset by the increase 
in the size of the time step At. In fact, the wall clock time needed to compute the 
same equilibrium state with the fully implicit Algorithm IV is one-sixth of the wall 
clock time for the fully explicit Algorithm I. 

(iv) The (physical) time to reach equilibrium — that is, NAt, the number of time 
steps needed to reach equilibrium times the step size — is (approximately) the same 
for all algorithms, namely, 25,000 (units of t; 2 /D). 

(v) The fully implicit Algorithm IV displays linear speedup in a multiprocess- 
ing environment. The speedup curves show sublincar behavior when the number of 
processors is large. 

(vi) The performance of the fully implicit Algorithm IV can be improved further 
by a multi-timestepping procedure, where the vector potential A is updated less 
frequently than the order parameter ip. 
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